Okay, hi everyone, and welcome to another session of our Computer Vision Zoom.
Today's demo talk is given by Vanessa Wirth. She is a PhD student at the Department
of Computer Science, Chair of Visual Computing, and today's session will focus on
traditional methods in computer vision from a graphics perspective. She will show
some demos of applications in graphics. So I welcome Vanessa, thank her again
for doing this demo, and the stage is yours.
Yeah, thank you. Hello everybody. As you might have already heard, I'm from the Chair of
Visual Computing, and since computer vision is a pretty interesting and trending field
nowadays, many academic chairs are getting in touch with computer vision, and so am I.
A little bit about me beforehand: I'm focusing primarily on 3D reconstruction. I started
with 3D reconstruction of static objects, for example furniture, which doesn't really move,
and nowadays I'm focusing on 3D reconstruction of non-rigidly moving objects, such as us
humans, trying to make the reconstruction of movement faster. So that's what I'm doing
nowadays, and today you will also hear a lot about reconstruction, of course, because
that's my speciality. Today we are focusing on traditional methods. So let's start. You
have already heard something about passive image acquisition on the earlier slides.
Today I'm presenting the Metashape program,
which is basically a collection of passive reconstruction algorithms. Just a short recap:
you may have heard about the three stages needed to find correspondences. If you want to
reconstruct something, you need to build a feature descriptor for each keypoint and then
match the descriptors together, for example by nearest-neighbour matching. So you're
essentially matching correspondences, and that is also what Metashape does, but in a
really professional way. I actually did some digging about Metashape, because it's a
proprietary program, so you don't really know what they are doing, but I searched the
forums for the algorithms they use. Here's a short description of what I know so far.
Metashape is a photogrammetry software.
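The nearest-neighbour matching step mentioned above can be sketched roughly as follows. This is only an illustration, not anything Metashape documents: the descriptors are random toy vectors rather than real SIFT output, and the ratio-test threshold of 0.8 is a common heuristic choice (Lowe's ratio test), not a known Metashape setting.

```python
import numpy as np

# Toy descriptor sets: desc_b rows 0..2 are noisy copies of
# desc_a rows 3, 0, 4, so those pairs should match; the last
# two rows of desc_b are unrelated clutter.
rng = np.random.default_rng(1)
desc_a = rng.standard_normal((5, 8))          # 5 descriptors, 8-D each
desc_b = np.vstack([desc_a[[3, 0, 4]] + 0.01 * rng.standard_normal((3, 8)),
                    rng.standard_normal((2, 8))])

def match(a, b, ratio=0.8):
    """Return (index_in_a, index_in_b) pairs that pass the ratio test."""
    matches = []
    for i, d in enumerate(a):
        dists = np.linalg.norm(b - d, axis=1)   # distance to every b descriptor
        j, k = np.argsort(dists)[:2]            # best and second-best match
        if dists[j] < ratio * dists[k]:         # keep unambiguous matches only
            matches.append((i, j))
    return matches

matches = match(desc_a, desc_b)
print(matches)
```

The ratio test discards a tentative match whenever the second-best candidate is almost as close as the best one, which is exactly the kind of ambiguity that repeated textures produce in real images.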
It is primarily used by archaeologists, because they of course want to reconstruct really
ancient artifacts and sites, so they use it quite often, but you can also see it in the
movie and game industries. For example, the Unreal Engine uses it, Halo 4 uses it, and you
can actually see it in Mad Max as well. The nice thing about this reconstruction software
is that if you find a beautiful scene somewhere in the world, you can photograph it and
then reuse it, for example in a movie. That is actually what Mad Max did: they
reconstructed the desert, as you can see here, and these objects are all reconstructed
too. So the scene is reconstructed, the vehicles are reconstructed, and then they did some
blending and lots of graphics work to make it look really beautiful, as you can see in the
picture at the bottom. That's what happens in the movie industry. And as I already said,
I did some digging into what
they might do in their algorithms. Metashape is built around a collection of
structure-from-motion algorithms. You have already heard about those, and it definitely
uses SIFT descriptors. They also say that, to estimate the parameters of the camera
behind the image you are currently looking at, they use something called bundle
adjustment. You might not have heard about that yet, but you have heard about the single
stages that are involved in it: you need to find points in 3D space, and you need camera
parameters to map those 3D points to 2D pixels. That is exactly what bundle adjustment
does, as a whole; that's also why it's called bundle adjustment, because it estimates the
camera parameters and the 3D coordinates of the visible points jointly, in one bundle.
So that is the algorithm they use. I also saw that, to get the final 3D surface, they use
multi-view stereo, because of course they are also taking multiple pictures.
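The "everything in one bundle" idea can be sketched on a toy problem. What follows is only a rough illustration, not Metashape's actual solver: it uses a simplified 2-D world with 1-D images, fixes two cameras to remove the gauge freedom of the reconstruction, and refines the remaining camera together with all scene points using a finite-difference Gauss-Newton loop; every name and value here is invented.

```python
import numpy as np

# Toy bundle adjustment: cameras sit on the x-axis looking along +y,
# and each observes the 1-D image coordinate u = (x - c) / y of every
# scene point (x, y). Real bundle adjustment uses the same
# jointly-refine-cameras-and-points idea, just with 6-DoF cameras,
# 3-D points, and a robust solver.
rng = np.random.default_rng(0)

cams_true = np.array([0.0, 1.0, 2.0])            # camera x-positions
pts_true = np.array([[-1.0, 5.0], [0.5, 6.0],
                     [1.5, 4.0], [0.0, 7.0]])    # scene points (x, y)

def project(c, p):
    """1-D pinhole projection of point p seen from a camera at x = c."""
    return (p[0] - c) / p[1]

# Synthetic observations obs[i, j]: point i seen from camera j.
obs = np.array([[project(c, p) for c in cams_true] for p in pts_true])

def unpack(params):
    # Cameras 0 and 1 stay fixed to pin down translation and scale.
    cams = np.array([0.0, 1.0, params[0]])
    pts = params[1:].reshape(-1, 2)
    return cams, pts

def residuals(params):
    """Reprojection errors over all (point, camera) observations."""
    cams, pts = unpack(params)
    return np.array([project(c, p) - obs[i, j]
                     for i, p in enumerate(pts)
                     for j, c in enumerate(cams)])

# Start from a perturbed guess and run a few Gauss-Newton steps,
# using a finite-difference Jacobian for brevity.
params = np.concatenate([[cams_true[2]], pts_true.ravel()])
params += 0.05 * rng.standard_normal(params.shape)

for _ in range(10):
    r = residuals(params)
    J = np.empty((r.size, params.size))
    for k in range(params.size):
        d = np.zeros_like(params)
        d[k] = 1e-6
        J[:, k] = (residuals(params + d) - r) / 1e-6
    step, *_ = np.linalg.lstsq(J, -r, rcond=None)
    params = params + step

final_cost = float(np.sum(residuals(params) ** 2))
print(final_cost)  # close to zero once cameras and points agree
```

A production solver would use analytic Jacobians, exploit the sparsity of the problem, and wrap the step in a Levenberg-Marquardt scheme, but the structure of estimating camera parameters and point coordinates in one joint optimization is the same.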
Presenters: Vanessa Wirth
Access: open access
Duration: 01:03:07 min
Recording date: 2021-07-05
Uploaded: 2021-07-05 15:57:00
Language: en-US
In this session, Vanessa Wirth shows demos of active/passive image acquisition for multi-view reconstruction. The demos are real-world applications that use Kinect v2 (time-of-flight) sensors and software such as Metashape (used by professional game and movie developers). She also talks about a 3D reconstruction technique called BundleFusion (Dai et al. 2017: https://dl.acm.org/doi/abs/10.1145/3072959.3054739), which was developed at FAU.
Vanessa is currently pursuing her PhD at the Chair of Visual Computing, Department of Computer Science. She has open theses/projects related to 3D reconstruction of objects and human bodies: https://www.lgdv.tf.fau.de/person/vanessa-wirth/